1. INTRODUCTION


Suppose $X_1, \ldots, X_N$ is a random sample of positive random variables from a distribution
with probability density function (pdf) at $x$ equal to $\beta f_\theta(\beta x)$. Here $\beta$ is a scalar precision
parameter, $\theta$ is a, possibly vector, shape parameter, and $N$ is unknown. In applications, $X_i$ is
often a length of time, such as a lifelength, and $X_i = x$ corresponds to the occurrence of an event
at time $x$. I shall use this temporal imagery without further explanation.

The first $n$ order statistics, $t = (t_1, \ldots, t_n)$, are observed, where $0 \le t_1 \le \cdots \le t_n \le T$. $T$ is
the period of observation: there is no $X_i$ such that $t_n < X_i \le T$. Inference is to be made about the
unknown parameters, and future observations are to be predicted.

I  shall  call  this  the  general  order  statistic  (GOS)  model.  Special  cases  have  been  proposed 
as models for market penetration and capture-recapture studies (Anscombe 1961), burn-in in
repairable  systems  (Bazovsky  1961,  chap.  8;  Cozzolino  1968),  software  reliability  growth 
(Jelinski  and  Moranda  1972;  Littlewood  1981),  estimating  the  number  of  individuals  exposed  to 
radiation  (Hoel  1968),  and  estimating  the  number  of  unseen  species  (Efron  and  Thisted  1976, 
and  references  therein). 

Perhaps the simplest special case is the exponential order statistic (EOS) model where
$f_\theta(x) = \exp(-x)$, whose statistical analysis has been studied extensively (Blumenthal and
Marcus 1975; Forman and Singpurwalla 1977; Goudie and Goldie 1981; Jewell 1985; Joe and
Reid 1985; Raftery 1986a). It has been used extensively as a simple, physical, debugging model
for software reliability. In this context it is often called the Jelinski-Moranda model, and is based
on the assumption that a system has $N$ faults, each of which causes a failure of the system, and is
then located and removed; the times at which the $N$ failures occur are independent and
identically distributed exponential random variables. However, the examples in Section 6 show
that it may give rather optimistic estimates of system reliability.
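The data-generating mechanism of the EOS model can be illustrated with a short simulation. The following is only a sketch: the function name `simulate_eos` and the values chosen for the number of faults, the precision parameter and the observation period are hypothetical and are not taken from the paper.

```python
import random

def simulate_eos(n_faults, beta, horizon, seed=0):
    """Simulate observed failure times under the EOS (Jelinski-Moranda) model.

    n_faults : N, the number of faults (unknown in practice)
    beta     : precision parameter; each failure time is exponential with rate beta
    horizon  : T, the length of the observation period
    """
    rng = random.Random(seed)
    # N independent, identically distributed exponential failure times.
    all_times = sorted(rng.expovariate(beta) for _ in range(n_faults))
    # Only the order statistics that fall within the observation period are seen.
    return [t for t in all_times if t <= horizon]

# Hypothetical values, for illustration only.
observed = simulate_eos(n_faults=50, beta=0.02, horizon=100.0)
n = len(observed)   # number of observed failures; N - n faults remain unseen
```

The analyst sees only `observed` and `horizon`; the inference problem is to say something about `n_faults - n`.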


The EOS model can be generalised by assuming that $X_1, \ldots, X_N$ are independent
exponential random variables with different means $\xi_1^{-1}, \ldots, \xi_N^{-1}$, where $\xi_1, \ldots, \xi_N$ is itself a
random sample from a distribution with pdf at $\xi$ equal to $\beta^{-1} w_\theta(\xi \beta^{-1})$. This is a special case of
the GOS model, where

$$f_\theta(x) = \int_0^\infty y\, w_\theta(y) \exp(-xy)\, dy \tag{1.1}$$

Miller (1986) has pointed out that many proposed software reliability models are, in fact, of this
form. When the $\xi_i$ have a gamma distribution, the $X_i$ have a Pareto distribution. This, the
Pareto order statistic (POS) model, is discussed in more detail in Section 5.2.

I  adopt  a  Bayes  empirical  Bayes  approach  (Deely  and  Lindley  1981)  to  the  problem  of 
inference  for  the  GOS  model.  This  has  the  advantage  of  permitting  comparisons  between 
competing, perhaps non-nested, models for $f_\theta(x)$ in a natural way (Section 2), as well as
providing  easily  implemented  inference  and  prediction  procedures  which  avoid  the  difficulties  of 
non-Bayesian  methods  (Section  3).  One  such  difficulty  is  that  the  maximum  likelihood  estimator 
of  N  may  be  infinite.  Indeed,  Goudie  and  Goldie  (1981)  concluded  that  for  the  special  case  they 
considered,  all  standard  non-Bayesian  point  estimation  techniques  are  liable  to  fail.  Attention  is 
paid  to  the  situation  where  vague  prior  information  about  the  model  parameters  is  approximated 
by  limiting,  improper,  prior  forms. 




Some  analytic  simplification  is  possible  for  the  Weibull  order  statistic  (WOS)  model,  where 
the $X_i$ have a Weibull distribution (Section 5.1). The examples in Section 6 suggest that this
model  may  be  promising  for  software  reliability  applications,  for  which  it  has  not  previously 
been  considered. 


2.  MODEL  COMPARISON 


Consider the GOS model described in Section 1. In this section and the next one I assume
that $\theta$ is known and omit it from the notation; this assumption is relaxed in Section 4. I assume
that $N$ has a Poisson distribution in the GOS model; this defines an empirical Bayes model in the
sense of Morris (1983).

It is equivalent to a non-homogeneous Poisson process with $\lambda(s)$, the intensity function at
time $s$, given by $\lambda(s) = \rho f(\beta s)$ $(\rho > 0)$. The likelihood is

$$p(t \mid \rho, \beta) = \rho^n \Big\{\prod_{i=1}^n f(\beta t_i)\Big\} \exp\{-\rho \beta^{-1} F(\beta T)\} \tag{2.1}$$

where $F(x) = \int_0^x f(y)\, dy$.
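As a concrete illustration of the likelihood (2.1), the following sketch evaluates its logarithm for an arbitrary choice of $f$ and $F$. The function name `log_likelihood`, the failure times and the parameter values are hypothetical; the EOS choice $f(x) = \exp(-x)$ is used only as an example.

```python
import math

def log_likelihood(times, horizon, rho, beta, f, F):
    """Log of (2.1): rho^n * prod_i f(beta * t_i) * exp(-(rho / beta) * F(beta * T))."""
    n = len(times)
    log_density_terms = sum(math.log(f(beta * t)) for t in times)
    return n * math.log(rho) + log_density_terms - (rho / beta) * F(beta * horizon)

# EOS choice: f(x) = exp(-x), so F(x) = 1 - exp(-x).
def f_exp(x):
    return math.exp(-x)

def F_exp(x):
    return 1.0 - math.exp(-x)

times = [1.0, 2.5, 4.0]   # hypothetical failure times
value = log_likelihood(times, horizon=5.0, rho=2.0, beta=0.5, f=f_exp, F=F_exp)
```

Working on the log scale avoids underflow when $n$ is large, which matters for data sets such as the 136 failures of Example 2.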


Consider the problem of comparing competing, perhaps non-nested, models for $f(x)$, $M_1$
and $M_2$, say. Such comparisons will be based on the Bayes factor, or ratio of posterior to prior
odds for $M_1$ against $M_2$,

$$B_{12} = p(t \mid M_1) / p(t \mid M_2) \tag{2.2}$$

the ratio of the marginal likelihoods. In (2.2),

$$p(t \mid M_i) = \int_0^\infty \int_0^\infty p(t \mid \rho, \beta, M_i)\, p(\rho, \beta \mid M_i)\, d\rho\, d\beta \quad (i = 1, 2) \tag{2.3}$$

If the priors $p(\rho, \beta \mid M_i)$ $(i = 1, 2)$ are proper, (2.2) can be evaluated directly.

I now develop an expression for $B_{12}$ in the situation where vague prior knowledge is
approximated by limiting, improper, prior forms. This is done by comparing $M_1$ and $M_2$ in turn
with the constant rate Poisson process, $M_0$: $\lambda(s) = \rho$, which is nested within each of $M_1$ and $M_2$.
This yields Bayes factors $B_{01}$ and $B_{02}$, where

$$B_{0i} = p(t \mid M_0) / p(t \mid M_i) \quad (i = 1, 2) \tag{2.4}$$

and

$$p(t \mid M_0) = \int_0^\infty p(t \mid \rho, M_0)\, p(\rho \mid M_0)\, d\rho \tag{2.5}$$

Then $B_{12} = B_{02} / B_{01}$. Comparison of $M_0$ with $M_i$ using (2.4) may itself be of interest. For
example, in the software reliability context, it provides a test of whether the system is, indeed,
being debugged.

I use the standard vague prior for $\rho$,

$$p(\rho \mid M_0) = c_0 \rho^{-1} \tag{2.6}$$

(Jaynes 1968), and consider the evaluation of $B_{01}$. In order to provide a satisfactory
approximation for vague prior knowledge over all scales, the prior distribution of $(\rho, \beta)$ should
yield a Bayes factor $B_{01}$ which is time-invariant, i.e. invariant to scale changes in the time
variable.




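Theorem 1: $B_{01}$ is time-invariant if and only if there is a function $\phi(\cdot)$ such that

$$p(\rho, \beta \mid M_1) = c_1 \rho^{-2} \phi(\rho^{-1} \beta) \tag{2.7}$$

Proof: Suppose $B_{01}$ is time-invariant. By (2.6)

$$p(t \mid M_0) = c_0 (n-1)!\, T^{-n} \tag{2.8}$$

Using (2.1), and substituting $\rho T^{-1}$ for $\rho$ and $\beta T^{-1}$ for $\beta$ in (2.3), and then dividing the result into
(2.8), yields, by (2.4),

$$B_{01} = c_0 (n-1)! \Big[ \int_0^\infty \int_0^\infty \rho^n \Big\{\prod_{i=1}^n f(\beta u_i)\Big\} \exp\{-\rho \beta^{-1} F(\beta)\}\, T^{-2} p(\rho T^{-1}, \beta T^{-1} \mid M_1)\, d\rho\, d\beta \Big]^{-1} \tag{2.9}$$

where $u_i = t_i / T$. Thus

$$T^{-2} p(\rho T^{-1}, \beta T^{-1} \mid M_1) = p(\rho, \beta \mid M_1) \quad (\rho, \beta, T > 0) \tag{2.10}$$

Setting $T = \rho$ in (2.10) yields (2.7), where $\phi(x) = p(\rho = 1, \beta = x \mid M_1)$. Also, when (2.7) holds, the
time-invariance of $B_{01}$ follows by direct substitution in (2.9). This completes the proof.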
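Two steps of the proof are stated without working, and may be written out as follows. This is a sketch under the assumption that the likelihood under $M_0$ is $\rho^n \exp(-\rho T)$, which is the usual likelihood for a constant rate Poisson process observed on $(0, T)$; the first display recovers (2.8) and the second checks that a prior of the form (2.7) satisfies (2.10).

```latex
% (2.8): marginal likelihood under M_0 with the prior (2.6)
p(t \mid M_0) = \int_0^\infty \rho^{n} e^{-\rho T}\, c_0 \rho^{-1}\, d\rho
             = c_0\, \Gamma(n)\, T^{-n}
             = c_0 (n-1)!\, T^{-n}.

% Sufficiency: the form (2.7) satisfies (2.10) for every T > 0
T^{-2}\, c_1 (\rho T^{-1})^{-2}\,
  \phi\!\left( \frac{\beta T^{-1}}{\rho T^{-1}} \right)
  = c_1 \rho^{-2}\, \phi(\rho^{-1} \beta).
```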

If the prior is to be asymptotically non-increasing in $\rho$ and $\beta$, then, by Theorem 1, $\phi(x)$
must be bounded above by $\gamma_1$ and below by $\gamma_2 x^{-2}$ for $x$ sufficiently large, where $\gamma_1$ and $\gamma_2$ are
positive constants. Consider now the case where the likelihood (2.1) is of exponential family
form, so that

$$f(\beta x) = \exp\Big\{a(\beta) + a(x) + \sum_{j=1}^{J} r_j (\beta x)^{d_j} + \text{const.}\Big\} \tag{2.11}$$

This  is  quite  a  general  family,  and  includes,  for  example,  the  gamma  and  Weibull  distributions. 
By  (2.1),  a  natural  family  of  conjugate  prior  distributions  is 


$$p(\rho, \beta \mid M_1) = c_1 \exp\Big\{k_0\, a(\beta) + \sum_{j=1}^{J} k_j \beta^{d_j}\Big\}\, \rho^{k_{J+1}} \exp\{-k_{J+2}\, \rho \beta^{-1} F(\beta T)\} \tag{2.12}$$

By Theorem 1, the unique prior of the form (2.12) which is independent of $T$ and yields
time-invariant Bayes factors for all models of the form (2.11) is

$$p(\rho, \beta \mid M_1) = c_1 \rho^{-2} \tag{2.13}$$

This  prior  is  also  independent  of  the  shape  parameter. 

It follows from (2.9) that, with the priors (2.6) and (2.13), the Bayes factor has the form

$$B_{01} = c_{01} (n-1)\, h(u)^{-1} \tag{2.14}$$

where $c_{01} = c_0 / c_1$, $u = (u_1, \ldots, u_n)$, $u_i = t_i / T$ $(i = 1, \ldots, n)$, and

$$h(u) = \int_0^\infty y^{n-1} \Big\{\prod_{i=1}^n f(y u_i)\Big\} F(y)^{-(n-1)}\, dy \tag{2.15}$$

However, (2.14) involves the arbitrary, undefined, multiplicative constant $c_{01}$, which
appears because the priors used are improper. Akman and Raftery (1986a) have shown how this
may be assigned using the minimal imaginary training sample idea of Spiegelhalter and Smith
(1982). This consists of imagining that a data set is available which involves the smallest
possible sample size permitting a comparison of $M_0$ and $M_1$, and provides maximum possible
support for $M_0$. It is then argued that the resulting Bayes factor, $B_{01}$, should be only slightly
greater than one. Raftery and Akman (1986) have applied this approach to the change-point
Poisson process; their results may be compared with the non-Bayesian solution of Akman and
Raftery (1986b). This approach has also been applied to log-linear models for contingency tables.

In the present situation, the appropriate imaginary data set consists of two observations at
the same value, $u = (u_1, u_2) = (v, v)$, where $v$ is chosen so as to maximise the value of $B_{01}$ in
(2.14). In practice, in all the examples considered, $B_{01}$ is maximised at either $v = 1$ or $v = 0$. When
$B_{01}$ is maximised at $v = 0$, however, the maximum value is infinite. In such cases, I use the local
maximum at $v = 1$, because this corresponds, in the software reliability situation, for example, to
the data set which suggests most strongly that the system is not being debugged. This yields

$$c_{01} = \int_0^\infty y f(y)^2 F(y)^{-1}\, dy \tag{2.16}$$
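For the EOS model, where $f(y) = \exp(-y)$ and $F(y) = 1 - \exp(-y)$, the integral in (2.16) can be checked numerically. The sketch below uses a hand-written composite Simpson rule and truncates the infinite range at 40; both are illustrative choices made here, not the IMSL routines used in Section 6. The closed form $\pi^2/6 - 1$ follows from expanding $1/(1 - e^{-y})$ as a geometric series, and agrees with the Weibull result of Section 5.1 when the shape parameter equals one.

```python
import math

def integrand(y):
    """y * f(y)^2 / F(y) for the EOS model, f(y) = exp(-y), F(y) = 1 - exp(-y)."""
    if y == 0.0:
        return 1.0  # limiting value as y -> 0, since y / (1 - exp(-y)) -> 1
    return y * math.exp(-2.0 * y) / (-math.expm1(-y))

def simpson(func, a, b, panels):
    """Composite Simpson rule; panels must be even."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

# The integrand decays like y * exp(-2y), so truncating at 40 loses a negligible amount.
c01_numeric = simpson(integrand, 0.0, 40.0, 4000)
c01_closed = math.pi ** 2 / 6.0 - 1.0   # sum over k >= 2 of 1 / k^2
```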

Strictly speaking, any value of $B_{12}$ less than one suggests that the data provide evidence
against $M_1$ and in favour of $M_2$. However, as a rough order of magnitude interpretation, Jeffreys (1961,
Appendix B) has suggested that the evidence should be regarded as strong only if $B_{12} < 10^{-1}$,
and as decisive only if $B_{12} < 10^{-2}$.

3. ESTIMATION AND PREDICTION

I now consider estimation of $N$, and prediction of future observations for the GOS model.
The framework developed in Section 2 is used. It follows from (2.13) that

$$p(N, \beta) = \int_0^\infty p(N \mid \rho, \beta)\, p(\rho, \beta)\, d\rho \tag{3.1}$$
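The integral in (3.1) can be evaluated in closed form. The working below is a sketch which assumes that, given $\rho$ and $\beta$, $N$ is Poisson with mean $\rho \beta^{-1}$; this is the total intensity $\int_0^\infty \rho f(\beta s)\, ds$ of the process of Section 2 and is my reading of the model rather than a statement made explicitly in the text.

```latex
p(N \mid \rho, \beta) = \frac{(\rho/\beta)^{N} e^{-\rho/\beta}}{N!},
\qquad
p(N, \beta)
  = \int_0^\infty \frac{(\rho/\beta)^{N} e^{-\rho/\beta}}{N!}\, c_1 \rho^{-2}\, d\rho
  = \frac{c_1\, \beta^{-N}}{N!}\, \Gamma(N-1)\, \beta^{N-1}
  = \frac{c_1}{\beta\, N (N-1)}
  \qquad (N \ge 2).
```

The factor $\{N(N-1)\}^{-1}$, multiplied by the factor $N!/(N-n)!$ of the sampling density of $t$ given $N$, gives $(N-2)!/(N-n)!$, which is the term $(M+n-2)!/M!$ in the posterior distribution of $M = N - n$.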


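Also,

$$p(t \mid N, \beta) = \{N! / (N-n)!\}\, \beta^n \Big\{\prod_{i=1}^n f(\beta t_i)\Big\} \bar F(\beta T)^{N-n} \tag{3.2}$$

where $\bar F(x) = 1 - F(x)$. Combining (3.1) with (3.2) and integrating over $\beta$ yields the posterior
distribution of the number of unobserved variables $M = N - n$,

$$p(M \mid t) \propto \{(M+n-2)! / M!\}\, g(u, M) \quad (M = 0, 1, \ldots) \tag{3.3}$$

where

$$g(u, M) = \int_0^\infty y^{n-1} \Big\{\prod_{i=1}^n f(y u_i)\Big\} \bar F(y)^M\, dy \tag{3.4}$$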
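For the EOS model the integral (3.4) is available in closed form, $g(u, M) = (n-1)!\, (\sum_i u_i + M)^{-n}$, so the posterior (3.3) can be computed without numerical integration. The following is a minimal sketch under that assumption; the function name `eos_posterior`, the truncation point `m_max` and the failure times are hypothetical. Because the unnormalised weights decay only like $M^{-2}$, the truncation point affects the normalisation slightly and should be chosen generously.

```python
import math

def eos_posterior(times, horizon, m_max=2000):
    """Posterior (3.3) of M under the EOS model, truncated at m_max.

    Uses g(u, M) proportional to (sum(u) + M)^(-n); requires n >= 2.
    """
    n = len(times)
    s = sum(t / horizon for t in times)   # sum of u_i = t_i / T
    # log of {(M + n - 2)! / M!} * (s + M)^(-n), computed stably with lgamma.
    log_w = [
        math.lgamma(m + n - 1) - math.lgamma(m + 1) - n * math.log(s + m)
        for m in range(m_max + 1)
    ]
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

times = [3.0, 8.0, 15.0, 26.0, 41.0, 60.0, 83.0]   # hypothetical failure times
post = eos_posterior(times, horizon=100.0)
mode = max(range(len(post)), key=post.__getitem__)  # posterior mode of M
```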

Point  estimators  of  N  may  be  obtained  by  combining  (3.3)  with  an  appropriate  loss 
function;  examples  are  the  posterior  mode  and  the  posterior  median.  However,  experience  with 
the  simple  EOS  model  indicates  that  point  estimators  of  N  are  liable  to  perform  badly  (Raftery 
1986a).  Interval  estimators  of  N,  such  as  highest  posterior  density  regions,  can  readily  be  found 
from  (3.3),  and  may  well  be  more  useful. 

Various prediction problems may be of interest, and can be solved, often quite easily, using
the present approach. One example is finding the probability, given the data, that there is no $X_i$
such that $T < X_i \le T + z$. In the software reliability context, this is the current reliability of the
system for a task of length $z$. If $Z = t_{n+1} - T$, where $t_{N+1} = \infty$, then

$$\begin{aligned}
P[Z > z \mid t] &= \sum_{M=0}^\infty \int_0^\infty P[Z > z \mid M, \beta, t]\, p(M, \beta \mid t)\, d\beta \\
&= P[M = 0 \mid t] + \{T/(T+z)\}^n \sum_{M=1}^\infty \{g(uT/(T+z), M) / g(u, M)\}\, p(M \mid t)
\end{aligned} \tag{3.5}$$
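The predictive probability (3.5) is simple to evaluate for the EOS model, where $g(u, M)$ is proportional to $(\sum_i u_i + M)^{-n}$. The sketch below makes that assumption; the function name `eos_reliability`, the truncation point `m_max` and the data are hypothetical.

```python
import math

def eos_reliability(times, horizon, z, m_max=2000):
    """P[Z > z | t] from (3.5) under the EOS model, with M truncated at m_max."""
    n = len(times)
    s = sum(t / horizon for t in times)   # sum of u_i = t_i / T
    shrink = horizon / (horizon + z)      # T / (T + z)

    # Posterior (3.3) of M, using g(u, M) proportional to (s + M)^(-n).
    log_w = [
        math.lgamma(m + n - 1) - math.lgamma(m + 1) - n * math.log(s + m)
        for m in range(m_max + 1)
    ]
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    post = [v / total for v in w]

    # {T/(T+z)}^n * g(uT/(T+z), M) / g(u, M) = {shrink * (s + M) / (shrink * s + M)}^n
    prob = post[0]
    for m in range(1, m_max + 1):
        prob += (shrink * (s + m) / (shrink * s + m)) ** n * post[m]
    return prob

times = [3.0, 8.0, 15.0, 26.0, 41.0, 60.0, 83.0]   # hypothetical failure times
r0 = eos_reliability(times, horizon=100.0, z=0.0)
r10 = eos_reliability(times, horizon=100.0, z=10.0)
r50 = eos_reliability(times, horizon=100.0, z=50.0)
```

As a sanity check, the reliability for a task of length zero is one, and it decreases as the task length grows.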


4.  SHAPE  PARAMETER  UNKNOWN 


Suppose now that the shape parameter $\theta$ in the GOS model is unknown. I continue to use
the framework of Sections 2 and 3, but quantities which depend on $\theta$ are now written with a
subscript $\theta$. I know of no single prior which can provide a satisfactory approximation for vague
prior knowledge about $\theta$ in all situations. I therefore assume that

$$p(\rho, \beta, \theta) = c_1 \rho^{-2} p(\theta) \tag{4.1}$$

where $p(\theta)$ is proper. I denote the set of possible values of $\theta$ by $\Theta$.

The results of Sections 2 and 3 can be generalised to this situation by conditioning on $\theta$ and
using the total probability law in an appropriate way. Thus (2.14) becomes

$$B_{01} = c_{01} (n-1)\, H(u)^{-1} \tag{4.2}$$

where

$$H(u) = \int_\Theta h_\theta(u)\, p(\theta)\, d\theta \tag{4.3}$$

and $h_\theta(u)$ is defined by (2.15). (2.16) becomes

$$c_{01} = \int_\Theta \int_0^\infty y f_\theta(y)^2 F_\theta(y)^{-1}\, dy\; p(\theta)\, d\theta \tag{4.4}$$

For estimation of $N$, (3.3) becomes

$$p(M \mid t) \propto \{(M+n-2)! / M!\}\, G(u, M) \quad (M = 0, 1, \ldots) \tag{4.5}$$

where

$$G(u, M) = \int_\Theta g_\theta(u, M)\, p(\theta)\, d\theta \tag{4.6}$$

and $g_\theta(u, M)$ is defined by (3.4).

For the prediction problem considered in Section 3, (3.5) becomes

$$P[Z > z \mid t] = P[M = 0 \mid t] + \{T/(T+z)\}^n \sum_{M=1}^\infty \{G(uT/(T+z), M) / G(u, M)\}\, p(M \mid t) \tag{4.7}$$


5.  SPECIAL  CASES 

5.1  The  Weibull  Order  Statistic  (WOS)  Model 

Among  commonly  used  models  for  positive  random  variables,  the  Weibull  distribution 
yields  some  analytic  simplification  of  the  results  in  Section  4.  The  WOS  model  is  defined  by 
setting 

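$$f_\theta(x) = \theta x^{\theta-1} \exp(-x^\theta) \quad (\theta > 0) \tag{5.1}$$

in the GOS model. Then $B_{01}$ is given by (4.2), (4.3), and (4.4), where

$$h_\theta(u) = \theta^{n-1} \Big(\prod_{i=1}^n u_i\Big)^{\theta-1} \int_0^\infty \exp\Big(-y \sum_{i=1}^n u_i^\theta\Big) \{y / (1 - e^{-y})\}^{n-1}\, dy$$

and $c_{01} = (\pi^2/6 - 1) E[\theta]$.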
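The constant $c_{01}$ for the Weibull case can be obtained from (4.4) by a change of variable. The following working is a sketch; it uses the substitution $z = y^\theta$ and the geometric series for $(1 - e^{-z})^{-1}$, neither of which is spelled out in the text.

```latex
\int_0^\infty y\, f_\theta(y)^2 F_\theta(y)^{-1}\, dy
  = \int_0^\infty \frac{y\, \theta^2 y^{2\theta-2} e^{-2 y^\theta}}{1 - e^{-y^\theta}}\, dy
  = \theta \int_0^\infty \frac{z\, e^{-2z}}{1 - e^{-z}}\, dz
  % z = y^\theta, \quad dz = \theta y^{\theta-1} dy
  = \theta \sum_{k=2}^\infty \int_0^\infty z\, e^{-kz}\, dz
  = \theta \sum_{k=2}^\infty k^{-2}
  = \theta \left( \frac{\pi^2}{6} - 1 \right).
```

Averaging over $p(\theta)$ as in (4.4) then gives $c_{01} = (\pi^2/6 - 1) E[\theta]$.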

The solutions to the estimation and prediction problems are given by (4.5), (4.6), and (4.7),
where

$$g_\theta(u, M) \propto \theta^{n-1} \Big(\prod_{i=1}^n u_i\Big)^{\theta-1} \Big(\sum_{i=1}^n u_i^\theta + M\Big)^{-n}$$


5.2  The  Pareto  Order  Statistic  (POS)  Model 

Consider the POS model described in Section 1, where in (1.1),

$$w_\theta(y) = \Gamma(\theta)^{-1} y^{\theta-1} e^{-y} \tag{5.2}$$

so that, by (1.1),

$$f_\theta(y) = \theta (1+y)^{-(\theta+1)} \tag{5.3}$$
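The step from the gamma mixing density to the Pareto form can be written out as a short worked calculation. It is a direct evaluation of the mixture integral (1.1) using the gamma integral, with $x$ denoting the argument of $f_\theta$.

```latex
f_\theta(x)
  = \int_0^\infty y\, \Gamma(\theta)^{-1} y^{\theta-1} e^{-y}\, e^{-xy}\, dy
  = \Gamma(\theta)^{-1} \int_0^\infty y^{\theta} e^{-(1+x) y}\, dy
  = \frac{\Gamma(\theta+1)}{\Gamma(\theta)}\, (1+x)^{-(\theta+1)}
  = \theta\, (1+x)^{-(\theta+1)}.
```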

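$B_{01}$ is again given by (4.2) and (4.3), where

$$h_\theta(u) = \theta^n \int_0^\infty y^{n-1} \Big\{\prod_{i=1}^n (1 + y u_i)^{-(\theta+1)}\Big\} \{1 - (1+y)^{-\theta}\}^{-(n-1)}\, dy$$

and (4.4) becomes

$$c_{01} = \int_\Theta \int_0^\infty y (1+y)^{-2(\theta+1)} \{1 - (1+y)^{-\theta}\}^{-1}\, dy\; \theta^2 p(\theta)\, d\theta$$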

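The solutions to the estimation and prediction problems are somewhat simplified if a
gamma prior for $\theta$ is used, namely, in (4.1),

$$p(\theta) \propto \theta^{\kappa_1 - 1} e^{-\kappa_2 \theta} \tag{5.4}$$

The solutions are given by (4.5) and (4.7), where

$$G(u, M) \propto \int_0^\infty y^{n-1} \Big\{\prod_{i=1}^n (1 + y u_i)^{-1}\Big\} \Big\{\kappa_2 + \sum_{i=1}^n \log(1 + y u_i) + M \log(1+y)\Big\}^{-(n+\kappa_1)}\, dy$$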
Most of the integrals in this section, which require numerical evaluation, could be replaced
by  convergent  infinite  series.  However,  this  was  not  found  to  be  computationally  advantageous. 


6.  EXAMPLES 


I  now  apply  the  techniques  proposed  here  to  three,  previously  analyzed,  software  reliability 
data  sets. 

Example  1:  Goel  and  Okumoto  (1979)  gave  the  31  failure  times  of  a  piece  of  software 
developed  as  part  of  the  Naval  Tactical  Data  System.  The  Bayes  factors  for  comparing  the 
models  considered  in  this  paper  are  shown  in  Table  1.  As  explained  in  Section  2,  these  were 
obtained  as  quotients  of  the  Bayes  factors  for  the  constant  rate  Poisson  process  against  each  of 
the  models  individually,  given  by  (4.2).  The  necessary  single  and  double  numerical  integrations 
were  carried  out  using  the  IMSL  routines  DCADRE  and  DBLIN,  respectively. 


Table  1  about  here 

For the WOS model (5.1), only distributions with tails at least as heavy as exponential were
considered, and $p(\theta)$ was taken to be uniform between [illegible] and 1. $\theta =$ [illegible] corresponds to a quite
heavy-tailed distribution, while $\theta = 1$ is the exponential distribution. With this prior, the WOS
model can be thought of as representing a situation where the bugs become harder to detect as
the debugging process proceeds.

For the POS model (5.3), the prior distribution of $\theta$ was given by (5.4) with $\kappa_1 = 2$ and
$\kappa_2 =$ [illegible], so that about 95% of the prior distribution of $\theta$ was concentrated between [illegible] and 10. $\theta =$ [illegible]
in (5.2) corresponds to a heavy-tailed distribution for $\xi_i$, while $\theta = 10$ corresponds to a
distribution for $\xi_i$ which is close to normality.


Table  1  shows  that  no  model  performs  markedly  better  than  any  other.  Indeed,  the  EOS 
model,  originally  proposed  for  this  data  by  Jelinski  and  Moranda  (1972),  seems  quite  acceptable. 

Example  2:  Meinhold  and  Singpurwalla  (1983)  gave  the  136  failure  times  of  a  real-time 
command  and  control  system,  and  analyzed  them  using  the  EOS  model.  The  same  priors  are 
used  as  in  Example  1.  The  Bayes  factors  in  Table  1  suggest  that  the  WOS  model  is  better  than 
both  the  EOS  and  POS  models.  The  posterior  distribution  of  M  for  the  EOS  and  WOS  models  is 
shown  in  Figure  1,  and  salient  features  are  summarised  in  Table  2.  It  appears  that  the  EOS  model 
substantially  underestimates  the  number  of  faults  still  present. 


Example  3:  Forman  and  Singpurwalla  (1977)  analyzed  a  data  set  consisting  of  107  failures 
using  the  EOS  model.  The  priors  used  are  the  same  as  in  the  first  two  examples.  The  data  were 
grouped,  and  I  distributed  the  failures  randomly  according  to  a  uniform  distribution  over  the 
time  intervals  in  which  they  occurred.  The  conclusions  of  all  the  model  comparisons  were  the 
same  for  each  of  four  different  sequences  of  random  numbers  used  to  distribute  the  failure  times; 
the  results  reported  here  are  for  one  of  these. 
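The treatment of the grouped data can be sketched as follows. The interval boundaries, failure counts and function name `spread_grouped_failures` are hypothetical, since the original data are not reproduced here; the seed argument plays the role of the different sequences of random numbers mentioned above.

```python
import random

def spread_grouped_failures(interval_edges, counts, seed=0):
    """Place each interval's failures uniformly at random inside that interval.

    interval_edges : increasing boundaries e_0 < e_1 < ... < e_k
    counts         : number of failures recorded in each of the k intervals
    """
    rng = random.Random(seed)
    times = []
    intervals = zip(interval_edges, interval_edges[1:])
    for (left, right), count in zip(intervals, counts):
        times.extend(rng.uniform(left, right) for _ in range(count))
    return sorted(times)

edges = [0.0, 10.0, 20.0, 30.0]   # hypothetical interval boundaries
counts = [4, 2, 1]                # hypothetical failure counts per interval
times = spread_grouped_failures(edges, counts)
```

Repeating the analysis with several values of `seed` gives a check that the conclusions do not depend on the particular random allocation.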


The WOS model was again the preferred one. There were other signs of the inadequacy of
the EOS model. For example, after 99 of the 107 recorded failures, the probability of eight or
more failures occurring was less than $10^{-4}$ under the EOS model, but 0.18 under the WOS
model.

The  posterior  distributions  of  the  number  of  remaining  faults  under  the  EOS  and  WOS 
models  are  shown  in  Figure  2.  The  EOS  model  gave  rather  optimistic  estimates  of  the  state  of 
the  system.  For  example,  under  the  EOS  model,  the  probability  of  the  system  having  been  fully 
debugged  was  0.95,  while  under  the  WOS  model  it  was  only  0.27. 


Figure  2  about  here 


In  addition  to  its  capacity  for  representing  slowly  decreasing  failure  rates,  the  WOS  model 
can also represent failure rates which increase and then decrease, when $\theta > 1$ in (5.1). This
possibility  has  not  been  exploited  here,  but  Littlewood  and  Verrall  (1981)  and  Ascher  and 
Feingold  (1984,  pp.110-111)  have  described  software  reliability  data  sets  of  which  this  is  a 
feature. 
Table 1. log10(Bayes factor)
for the model comparisons in Examples 1, 2, 3.

                   Example
Comparison      1      2      3

Table 2. Features of the posterior distribution of M, the number of
remaining bugs, under the EOS and WOS models, in Examples 2 and 3.

                              Feature
Example   Model   Mode   Median   P[M=0|t]   95% HPDR
2         EOS       6      6.5      .01       1-16
          WOS      27     40.7      .00       6-122
3         EOS       0       .0      .95       0
          WOS       1       .9      .27       0-6

NOTE: 95% HPDR is the 95% highest posterior density region. i-j denotes the set of
integers from i to j inclusive.




Captions  for  Figures  1  and  2: 


Figure 1. Posterior distributions of $M$, the number of remaining bugs, in Example 2 under (a)
the  EOS  model,  and  (b)  the  WOS  model.  The  WOS  model,  which  is  favored  by  the  data,  estimates 
a  much  larger  number  of  remaining  bugs  than  the  EOS  model. 

Figure  2.  Posterior  distributions  of  M  in  Example  3  under  (a)  the  EOS  model,  and  (b)  the  WOS 
model.  Under  the  EOS  model,  almost  the  entire  posterior  distribution  of  M  is  concentrated  at  0, 
while  from  the  WOS  model,  which  is  favored  by  the  data,  it  appears  that  there  may  be  up  to  six 
remaining bugs with non-negligible probability.